Spam Filtering using Contextual Network Graphs
Abstract
This document describes a machine-learning solution to the spam-filtering problem. Spam filtering is treated as a text-classification problem in a very high-dimensional space. Two new text-classification algorithms, Latent Semantic Indexing (LSI) and Contextual Network Graphs (CNG), are compared to existing Bayesian techniques by monitoring their ability to process and correctly classify a series of spam and non-spam documents. The LSI and CNG algorithms have an advantage over the Naïve Bayes classifier in the domain of natural language processing because they include a representation of ‘context’, that is, of relations between terms. Both LSI and CNG exploit these relations to offer a conceptual, or semantic, search, which this paper adapts to the domain of spam filtering.
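The idea of classifying through relations between terms can be illustrated with a small sketch. It is not the paper's implementation: it assumes a bipartite graph linking terms to the training documents that contain them, and a simple spreading-activation pass in which energy injected at the query's terms flows to documents and back to their terms. The function names, the `hops` and `decay` parameters, and the toy training set are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_graph(labeled_docs):
    """Build a bipartite term-document graph from (text, label) pairs."""
    term_to_docs = defaultdict(set)
    doc_to_terms = {}
    labels = {}
    for doc_id, (text, label) in enumerate(labeled_docs):
        terms = set(text.lower().split())
        doc_to_terms[doc_id] = terms
        labels[doc_id] = label
        for term in terms:
            term_to_docs[term].add(doc_id)
    return term_to_docs, doc_to_terms, labels


def spread_activation(query, term_to_docs, doc_to_terms, hops=2, decay=0.5):
    """Return the energy accumulated on each document node.

    hops and decay are illustrative assumptions, not values from the paper.
    """
    # Inject one unit of energy at every query term known to the graph.
    term_energy = {t: 1.0 for t in set(query.lower().split()) if t in term_to_docs}
    doc_energy = defaultdict(float)
    for _ in range(hops):
        # Terms -> documents: a term splits its energy among its documents.
        incoming = defaultdict(float)
        for term, energy in term_energy.items():
            docs = term_to_docs[term]
            for doc_id in docs:
                incoming[doc_id] += energy / len(docs)
        for doc_id, energy in incoming.items():
            doc_energy[doc_id] += energy
        # Documents -> terms: pass decayed energy on to each document's terms,
        # so related terms that were not in the query also become active.
        term_energy = defaultdict(float)
        for doc_id, energy in incoming.items():
            terms = doc_to_terms[doc_id]
            for term in terms:
                term_energy[term] += decay * energy / len(terms)
    return dict(doc_energy)


def classify(query, graph):
    """Label a message with the class whose documents gathered the most energy."""
    term_to_docs, doc_to_terms, labels = graph
    energy = spread_activation(query, term_to_docs, doc_to_terms)
    totals = defaultdict(float)
    for doc_id, value in energy.items():
        totals[labels[doc_id]] += value
    if not totals:
        return None  # no query term appears in the graph
    return max(totals, key=totals.get)


# Toy training set (hypothetical data, for illustration only).
TRAINING = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches online", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes attached", "ham"),
]
GRAPH = build_graph(TRAINING)
```

Calling `classify("cheap watches", GRAPH)` sends energy only to the two spam documents, so the message is labeled spam. A Naïve Bayes classifier would score each term independently, whereas here the second hop lets terms that co-occur with the query terms contribute as well.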
Similar resources
Spam Source Clustering by Constructing Spammer Network with Correlation Measure
Spam filtering is one of the most challenging problems in electronic messaging systems. In general, recent studies on identifying the real source of spam are based on content filtering, because spammers usually falsify their origin. We propose a method to identify spam sources based on structural analysis with complex networks. We assume that each spam source either has the same victim list or uses the same s...
Filtering Network Spam Message using Approximated Logistic Regression
The development of telecom networks and the Internet provides effective means of communication. As an important communication channel, Short Messaging Service (SMS) over both telecom networks and the Internet plays an increasingly important role in daily life. However, it usually suffers from spam SMS, which causes misunderstanding and fraud. The highly varying content, network environment make the identi...
Detecting Image Spam Using Image Texture Features
Filtering image email spam is considered a challenging problem because spammers keep modifying the images used in their campaigns by employing different obfuscation techniques, thereby preventing text recognition with Optical Character Recognition (OCR) tools and imposing additional challenges in filtering this type of spam. In this paper, we propose an image spam filtering tech...
Chinese Spam Filtering Based On Back-Propagation Neural Networks
As email becomes an important means of communication on the network, spam is increasing every day. This paper describes a new filtering model based on email content that uses Back-Propagation Neural Networks (BPNN). For Chinese email, it uses the Natural Language Processing & Information Retrieval Sharing Platform (NLPIR) system to perform Chinese word segmentation. The simula...
An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network
In recent years, there has been considerable interest in using short message service (SMS) as one of the essential and straightforward communication services on mobile devices. The increased popularity of this service has also increased the number of attacks on mobile devices, such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...